ABSTRACT

An overview of contemporary assessment practices
often used with Latino students is provided, and it is maintained
that these practices often overlook the influence of culture and
linguistic proficiency on cognitive performance. A new approach, the
Sentence Verification Technique (SVT), is proposed as an alternative
method to assess the linguistic proficiency of language minority
populations. The SVT, developed as a technique for measuring reading
and listening comprehension, is built by developing one of
four types of test sentences from each sentence appearing in a test
passage; the student then judges each test sentence against the
passage. The four sentence types are: a copy of an original
sentence, a paraphrase of a sentence, a sentence whose meaning has
been changed, and a distractor sentence that is similar in
vocabulary and syntactic structure to the sentences in the passage
but unrelated in meaning. The effective application of the SVT in a
transitional bilingual education program in Holyoke, Massachusetts,
is described, and the future needs and directions of linguistic
proficiency assessment of Latino students are discussed. (DJD)






SRRI #205




Cultural and Linguistic 
Influences on Latino Testing 



Jose P. Mestre and James M. Royer
University of Massachusetts, Amherst

Scientific Reasoning Research Institute 



December, 1988 



Running Head: Issues in Latino Testing

This paper was supported by a grant from the National Science Foundation,
NSF #BNS-8511069.





The juxtaposition of two facts foreshadows a problem looming
in American education. Fact one is that Latinos are the most
rapidly growing minority group, with demographic trends indicating
that they will be the largest minority group in this country by
the turn of the century. Fact two is that Latinos score
significantly below the majority population in standard
assessments of both academic aptitude and academic achievement.
The challenge for our instructional and assessment systems is to
address the problem described in the second fact by devising
approaches to accurately assess and raise the educational
attainment of Latino students. Indeed, this is an important
challenge since maintaining the competitive edge that is crucial
to the country's economic well-being requires a highly educated
workforce.

What is the cause of the academic underachievement of Latino
students in the U.S.? If there were a simple answer to this
question we could immediately set about making the appropriate
changes in our educational system. Any reasonable attempt to
formulate an answer to this question would need to draw on the
interplay of socio-economic, cultural, developmental and
linguistic factors. Research findings do point to linguistic
proficiency as the single most important mediator of academic
achievement for Latino students (Cummins, 1981, 1982; De Avila,
1988). However, we should heed De Avila's warning that
"linguistic proficiency in English, although necessary, does not
seem to be a sufficient condition [for academic achievement]" (p.
116).

Because linguistic proficiency is such an important mediator
of academic achievement, the ability to accurately assess
linguistic proficiency among Latino students is of paramount
importance. Although this will be a focal issue of this chapter,
it is important to begin with an overview of ways in which culture
and linguistic proficiency can affect cognitive performance, and
to illustrate how assessment practices often overlook the
influence of these two factors. We then describe a new approach
for assessing linguistic proficiency, called the Sentence
Verification Technique (SVT), that is particularly well suited for
language minority populations, and discuss how this approach has
been used successfully in a transitional bilingual education
program. We conclude with some thoughts on the future of
linguistic assessment among Latino students.

The Interplay of Culture in Cognitive Performance

The effect of culture on cognitive performance has been
documented in various studies. At one extreme is the interesting
example of the Oksapmin culture of Papua New Guinea. The
Oksapmin were an isolated stone age culture until the latter part
of this century when increased contact and trade with western
civilization brought modifications to their culture. The
influence of western culture was perhaps manifest most clearly in
the Oksapmin number system (Saxe, 1985). The Oksapmin number
system consists of 27 body parts with no base structure. To count
in this system, Oksapmin begin with the thumb of one hand and
recite body part names moving around the upper periphery of the
body. Although this system was perfectly suitable for their
cultural needs (e.g., counting a set of valuables, or indicating
the ordinal positions of two villages along a path), it was
unsuitable for dealing with the base 10 system required for
monetary exchange during trading activities. Interestingly,
observations of Oksapmin children attending "bush" missionary
schools revealed that they were trying to adapt and use their body
part counting system to solve base 10 arithmetic problems. This
example clearly illustrates how children use and adapt knowledge
acquired in the home culture within an academic setting.

A similar cultural influence on mathematical problem solving
can be found among the Ute Native American tribe of northeastern
Utah (Leap, 1988). Leap found that Ute students often evaluated
the "truth value" of a particular word problem and worked with
that problem according to the findings of that evaluation. For
example, when a Ute student was asked during a problem solving
interview to determine how much money his brother would have to
spend on gasoline if he wanted to drive his truck from the
reservation to Salt Lake City, the student did not attempt to
solve the problem based on the information presented in the
request. Rather, he assessed the truth value of the request and
answered, "My brother does not have a pick-up truck." According
to Leap, such assessments are a common part of everyday life in
the reservation and are related to the Ute language, which has
very precise mechanisms for evaluating the "degrees of reality" of
an event. Although it could be argued that the primary link to
this behavior is language and not culture, this is clearly not the
case: The Ute students in the study were not fluent in the Ute
language, although Ute was used in the home. Thus, despite their
lack of fluency in the Ute language, Ute children adopt tribal
traditions and use them in their approach to mathematical problem
solving.

Among Mexican-Americans, most studies of the role of culture
on achievement have focused on familism (MacCorquodale, 1988).
Familism refers to the relative importance of family members in
determining an individual's values, goals and orientation. Some
argue that the family obstructs intellectual development because
the needs of the family are supposedly placed above those of an
individual family member (Grebler, Moore and Guzman, 1970; Montez,
1960). There is evidence that Mexican-Americans who are more
independent of their families exhibit greater educational
achievements than those who maintain closer family ties (Schwartz,
1971). However, others argue that family orientation does not
interfere with either aspirations or achievement (Lopez-Lee,
1972). Research findings support this view: Anglos and
Mexican-Americans do not differ in educational values and aspirations
(Aiken, 1979; Espinoza, Fernandez and Dornbusch, 1977; Juarez &
Kuvlesky, 1969). Regardless of which view is correct, there does
appear to be a clear difference between Anglo and Mexican-American
family values in one area: Mexican-American parents are more
traditional in their attitudes toward gender roles than Anglo
parents are; Mexican-American girls are likely to be encouraged to
pursue careers that will not interfere with their future family
life (MacCorquodale, 1988).

In contrast to what one might expect given the performance
differences between Anglos and Mexican-Americans in national
assessments, research evidence suggests that Mexican-Americans are
more likely than Anglos to do homework and that Mexican-American
parents are very supportive of their children's education
(MacCorquodale, 1984, 1988). These findings raise the interesting
question, "To what can the differences in educational attainment
between Anglos and Mexican-Americans be attributed?"
MacCorquodale argues that Mexican-American parents are unable to
translate their encouragement and support into concrete actions
(e.g., helping their children with homework or advising their
children on what courses to take in school), in part due to their
limited educational background.

Although it could be argued that the examples discussed above
would have an indirect effect on cognitive performance, there is
evidence that culture can directly affect one of the more
important instances of cognitive performance: reading
comprehension. For example, one study investigated the reading
comprehension of students from two distinctly different cultural
groups (Americans and Asian Indians) on two stories, one based on
their own culture and one based on the foreign culture
(Steffensen, Joag-Dev & Anderson, 1979). The two stories
described a wedding in the U.S. and an Asian Indian wedding.
After reading passages describing both weddings, subjects were
able to recall considerably more of the material that was
culturally familiar to them, and also rated their understanding of
the culturally familiar material higher.

Similar findings were obtained in a study of people's
comprehension of baseball (Voss, Vesonder & Spilich, 1980). When
passages describing the events transpiring during a baseball game
were read by individuals possessing high knowledge and low
knowledge of the game of baseball, the high-knowledge individuals
were better able to recall the salient features of the baseball
passage. In this example, being well-versed in the "culture" of
baseball served as a clear advantage in reading comprehension.
Similar findings were obtained in a study of children's
comprehension of a passage about spiders (Anderson & Shifrin,
1980). It therefore appears that the presence of material that is
culturally foreign to students will adversely affect performance
on tests, even though the students may have mastered the skills
that the test is supposedly testing.

Culturally unfamiliar material has an adverse effect not only
on performance but also on learning. A study of ninth grade
Latinos learning algebra (Mestre, 1988) revealed that the students
were not reading the textbook as a means of supplemental
instruction. Because students did not read the textbook, the only
instruction they received consisted of the in-class presentations
by the teacher. Although readings from the textbook were
assigned, students used the book only as a place to find problems
assigned for homework. Upon closer scrutiny, it became evident
that the context of the material in the textbook was largely
unfamiliar to the students. When students were asked to explain
the meaning of terms appearing in a typical section of the book,
such as "shares of stock," "revolving charge account," "monthly
payments," and "interest," it became clear that they had little
idea what these terms meant. The context of the material in the
textbook might be suitable for students from middle or high
socioeconomic backgrounds, but it was totally unsuitable for the
low socioeconomic backgrounds of the Latinos in that algebra
class, a criticism that we are not the first to make (Taylor,
1978).

The Interplay of Language in Cognitive Performance

The lack of language proficiency can adversely affect
cognitive performance through a variety of avenues, some more
blatant than others. It is fairly easy to detect the more obvious
ways in which language adversely affects cognitive performance,
such as the case of a student asked to perform a task in a
language in which she or he is not proficient. We will focus
our discussion on the more subtle avenues through which language
proficiency affects cognitive performance, since these are not as
easy to detect.

Before proceeding further it is important to define language
proficiency since studies often attribute the poor performance of
language minority students to lack of language proficiency without
ever assessing it (De Avila, 1988); clearly these studies equate
language proficiency with ethnicity. We will define a language
minority student to be proficient in the second language if that
student's language proficiency level is equivalent to that of the
"average" monolingual student of the same age. We take it as
self-evident that a language minority student who is not
proficient in the second language, as we have defined the term,
will not perform as well on a task requiring language proficiency
skills as will a mainstream monolingual student. The next obvious
issue to consider is how likely it is for a language minority
student to be proficient.

Most bilingual programs in this country are "transitional"
programs, meaning that the school system has 3 years to bring the
second language proficiency level of the students to a level where
they can be mainstreamed. However, there is evidence that this is
not nearly enough time to make students proficient. Indications
are that it takes between five and seven years for language
minority students to approach language proficiency levels
equivalent to their monolingual peers (Cummins, 1982).
Nevertheless, often language minority students are mainstreamed
before they are proficient, and many subsequently exhibit below
average performance in mainstream classrooms and on assessments of
academic achievement. The reason they frequently have difficulty
can be found in the relationship between language proficiency and
cognitive performance.

After approximately two years of exposure to English, many
language minority students display sufficient skills in English to
be able to communicate quite adequately with their monolingual
peers in face-to-face situations. This type of "proficiency" is
frequently presumed to be sufficient to allow students to enter
the mainstream curriculum and compete favorably with monolingual
students, a presumption that is inaccurate. According to Cummins
(1980, 1981, 1982) it is important to distinguish between the
language proficiency needed for face-to-face communications and
the language proficiency needed for academic work. Face-to-face
communication is "context-embedded" in the sense that there are
many cues to aid communication (e.g., gestures and intonation). On
the other hand, much academic work takes place within a
"context-reduced" situation where the cues present in face-to-face
communication do not exist. For example, in attempting to read
and comprehend a complex text all a student has to go on are the
words in the text. Ability to communicate in context-embedded
situations will not necessarily help a language minority student
perform in a context-reduced academic task.

This was the situation experienced by the first author, who
was mainstreamed after spending two and one-half years in a
bilingual program. He was able to communicate tolerably well with
his monolingual peers, and even perform at an "above average" level
in school. In those cases where he did not understand the
teacher's instruction or assigned work, he would turn to a
neighboring student and ask for clarification. Thus, by forcing a
context-reduced situation into a context-embedded situation, he
was able to function quite well. However, it was in tests
requiring substantial linguistic proficiency (both in-class and
standardized) that problems arose. In these context-reduced
assessments, the only cues available were the words on the paper.
Particularly problematic were tasks where a passage was to be read
followed by a series of "comprehension" questions; if several key
words in either the passage or the questions were unfamiliar,
there was nothing to do but guess and hope for the best.

In addition to the above unsolicited testimonial, there is
research evidence of how limited English proficiency adversely
affects performance on context-reduced academic tasks. One study
with Latino college students revealed that some error patterns in
solving math word problems were the result of language
deficiencies; students were working out the problems incorrectly,
but consistent with their interpretation (Mestre, 1986a). Similar
evidence has led Dawe (1984) to hypothesize that, in order to
perform well in cognitively demanding mathematical tasks, students
must reach a threshold level of proficiency both in mathematical
knowledge (i.e., mathematical concepts and how they are applied)
and in the language used to express that knowledge. This language
that is specific to mathematics has been termed the "mathematics
register" and is comprised of the variety of language oriented to
mathematics activities, including the various linguistic forms,
along with their meanings and uses (Halliday, 1975; Spanos,
Rhodes, Dale & Crandall, 1988). Whereas it was previously thought
that mathematics was a domain in which language proficiency played
a relatively unimportant role, recent research suggests that this
is not the case (Kintsch & Greeno, 1985). To be able to achieve
in mathematics, students not only must be proficient in the
English language, which is used throughout mathematics, but also
must be proficient with the mathematics register, which defines
the specific uses of the English language within mathematics.

Another example of how language interacts with cognitive
performance emerged in a study of premise comprehension with both
Anglo and Latino college students (Mestre, 1988). Findings from
this study revealed that all students were using rules that govern
the comprehension of natural discourse to interpret premises,
rather than the rules of logic. For example, students were
inclined to interpret the premise, "not all clerks are male" to
mean "some clerks are male," rather than the appropriate
interpretation, "some clerks are female" (the statement "not all
clerks are male" is consistent with all clerks being female, thus
it is incorrect to assume that some clerks are male). Although
there were no significant differences between Anglos and Latinos
in performance, an interesting difference emerged in evaluating an
intervention strategy designed to improve performance.

The intervention strategy was a thirty-minute videotaped
lesson which covered the rules for parsing and interpreting
premises containing different numbers and types of negations.
Immediately following the lesson, all Anglo students reached
ceiling level performance on a post test assessing their ability
to interpret premises. One week following the lesson, all Anglo
subjects had retained a ceiling level performance. Six months
later, 93% of the Anglo subjects had retained ceiling level
performance. The pattern for the Latino subjects was different:
only 65% reached ceiling level performance immediately following
the lesson. Among those Latinos who had reached ceiling level
performance after the lesson, only 38% were able to attain ceiling
level performance six months after the lesson. This disparity in
learning and retention from a context-reduced, linguistically
laden lesson was attributed to the disparity in English
proficiency between the Anglo and Latino groups; the average
SAT-Verbal score of the Anglo group was a full 170 points higher than
that of the Latino group. This study illustrates both the short-
and long-term effects of linguistic deficiencies on learning and
performance in academic situations.

Issues of Concern With Latino Assessment

The research reviewed in the previous two sections
suggests that there are special considerations involved in the
construction, use and interpretation of assessment instruments
among Latino students, particularly when those students are in the
transition stage from first to second language proficiency. But
it is equally evident that current assessment procedures are not
responsive to these considerations. As Cummins states, "Most
minority language students are still . . . assessed . . . with
assessment tools and procedures that were designed only for
children from the majority Anglo group" (1982, p. 1). These
assessment instruments necessarily reflect the values and culture
of both those who design the instruments and the mainstream
population. For example, a passage in a reading comprehension
test about a nature hike through the New England woods in winter
may bear no overlap with the experiences of Latino students
living in Puerto Rico or southern Texas. The Latino students may
lack the mental schemata that facilitate interpreting and making
sense out of that passage (Cabello, 1984; Steffensen et al.,
1979; Voss et al., 1980).

Often, poor performance on these assessment instruments by
bilingual students is used to reinforce myths, such as
"bilingualism causes language handicaps," or "the best way for
bilingual students to make progress in the second language is to
eradicate the first language." There is research evidence that
these myths are false (Cummins, 1981; Kessler, 1987; Leap, 1988).

Further, attempts to "patch up" assessment instruments
designed for the majority population to make them suitable for
Latino populations are often flawed. The most common patch-up
consists of translating a test into Spanish. This practice places
the student in double jeopardy: not only will the cultural values
of the test remain intact, but there is now the risk that the
translation will not be adequate for the target group (Cabello,
1984); for example, differences in the level of vocabulary across
the two versions of the test, and the multiplicity of translations
for particular words (e.g., "kite" could be translated as
"papagayo," "cometa" or "papelote" depending on the country of
origin of the translator), are just two of several possible
limitations in translated tests (Wilen & Sweeting, 1986).

The analysis presented thus far suggests that current
assessment procedures may be appropriate for Latino students when
they have acquired academic-level proficiency in English and when
they have acquired sufficient familiarity with American culture,
but they are inadequate procedures for assessment during the
period when the students are gaining second language proficiency
and are being acculturated. Given the shortcomings of current
assessment procedures, what would the ideal assessment procedure
look like? The literature reviewed in the previous sections
suggests two guidelines. First, the ideal assessment procedure
would be sensitive to cultural experiences. This suggests that
the ideal procedure would be tailor-made to suit the cultural
background of the target population. Mexican-American students in
southern California might be assessed using one instrument, and
children of Cuban background in Florida assessed with another.
Second, the procedure should be sensitive to level of language
proficiency. That is, the procedure should be able to trace the
development of second language proficiency from the acquisition of
"face-to-face" competence to the mastery of academic level
proficiency.

In the section to follow we describe an assessment procedure
that attempts to be responsive to these guidelines. The first
part of the section will present a general overview of the
technique. This will be followed by sections that describe how
the technique is responsive to the guidelines set forth, and data
will be presented that address the validity of the procedure as
a measure of progress in a transitional bilingual education
program.

Overview of the Sentence Verification Technique

The Sentence Verification Technique (SVT) is a recently
developed technique for measuring reading and listening
comprehension that was first introduced by Royer, Hastings, and
Hook (1979). The technique entails developing one of four types
of test sentences from each sentence appearing in a text passage.
The first type of test sentence is called an original and it is a
copy of a sentence as it appeared in the passage. The second type
of test sentence, called a paraphrase, is constructed by changing
as many words as possible in an original sentence without altering
the meaning of the sentence. The third type of test sentence is
a meaning change, and it is constructed by changing one or
two words in the sentence so that the meaning of the sentence is
altered. The final kind of test sentence is called a distractor
and it is a sentence that has a vocabulary level and syntactic
structure that is similar to sentences in the passage and is
consistent with the overall theme of the text passage, but the
sentence is unrelated in meaning to any sentence that appeared in
the passage.

An SVT test consists of a set of passages, each of which is
followed by a set of test sentences. Each set of test sentences
consists of equal numbers of each of the test sentence types. So,
for example, most of the research using the SVT has used
12-sentence passages and either 12 test sentences (3 each of the test
sentence types) or 16 test sentences (4 each of the test sentence
types). An examinee taking an SVT test reads or listens to the
passage, and then in the absence of the text judges each of the
test sentences to be "old" or "new" (the elementary school
versions of SVT tests have recently started using "yes" and "no"
as substitutes for old and new). Old (yes) sentences are defined
as sentences that are the same as or mean the same as passage
sentences (originals and paraphrases) and new (no) sentences have
a different meaning than passage sentences (meaning changes and
distractors). More details on the development and administration
of SVT tests can be found in Royer, Greene, and Sinatra (1987),
and in Royer (in press).
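
The item structure just described can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
names SvtItem, EXPECTED and is_balanced are hypothetical, and the
example sentences are invented rather than taken from any SVT test.
The sketch records, for each of the four sentence types, whether the
correct judgment is "old" or "new", and checks that a set of test
sentences contains equal numbers of each type.

```python
from collections import Counter
from dataclasses import dataclass

# The four sentence types named in the text, mapped to the judgment
# that counts as correct ("old" = same meaning as a passage sentence).
EXPECTED = {
    "original": "old",
    "paraphrase": "old",
    "meaning_change": "new",
    "distractor": "new",
}


@dataclass(frozen=True)
class SvtItem:
    """One test sentence (hypothetical container, not from the source)."""
    text: str
    kind: str  # must be one of the keys of EXPECTED

    @property
    def expected(self) -> str:
        return EXPECTED[self.kind]


def is_balanced(items):
    """True if all four sentence types appear, each equally often."""
    counts = Counter(item.kind for item in items)
    return set(counts) == set(EXPECTED) and len(set(counts.values())) == 1


# Invented example sentences, one per type.
items = [
    SvtItem("The dog chased the ball across the yard.", "original"),
    SvtItem("The puppy ran after the toy through the garden.", "paraphrase"),
    SvtItem("The dog chased the cat across the yard.", "meaning_change"),
    SvtItem("The dog slept on the porch all afternoon.", "distractor"),
]

print(is_balanced(items))
print([item.expected for item in items])
```

A 12-item or 16-item test as described above would simply hold 3 or 4
items of each type; the balance check is the same.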

Theoretical Rationale for the SVT

The use of a verification technique as a measure of
comprehension was shaped to a considerable degree by the
theoretical assumption that comprehension is a "constructive"
process that results in a memory representation that preserves the
meaning but not the form of a linguistic message. The
constructivist theoretical framework (e.g., Brown, Bransford,
Ferrara, & Campione, 1983; diSibio, 1982; Jenkins, 1974, 1979;
Kintsch & van Dijk, 1978; Royer, 1985; Royer & Cunningham, 1981)
asserts that the process of comprehension entails an interaction
between context, the linguistic message, and the knowledge base of
the listener or reader. This interaction results in the
construction of an interpretation of a linguistic message that
preserves the meaning but not the surface structure of the
message. This process of forming a memory representation is
thought to occur more or less simultaneously with the reception of
the message (e.g., Carroll, 1972; van Dijk & Kintsch, 1983), and
it is largely unconscious except in instances where processing
difficulties are encountered (e.g., Kintsch & van Dijk, 1978).

Constructive theory suggests that the "product" of
comprehension is a memory representation that preserves the
meaning of a linguistic message. This perspective, in turn,
suggests that comprehension could be measured by determining if
readers or listeners had successfully established a
meaning-preserving memory representation of something they had read or
heard. The SVT was designed to accomplish this purpose.

If readers have comprehended a text and established
meaning-preserving memory representations of that text, then they should
be able to correctly judge that original and paraphrase test
sentences have the same meaning as their memory representations,
and they should be able to correctly reject meaning change and
distractor test sentences as having a meaning different than their
memory representations. However, if a reader has not successfully
established a meaning-preserving memory representation, he or she
should have great difficulty in correctly classifying the test
sentences as having the same or a different meaning than a text
sentence.

Scoring and Interpretation of SVT Tests 

Two procedures have been used in scoring SVT tests. The
first is a simple computation of proportion correct. Proportion
correct can be computed for overall performance, for performance
on separate passages, for performance on particular sentences
within a passage, and for particular test sentence types (e.g.,
originals, paraphrases, etc.) within or across passages. Most of
the research on the SVT has entailed calculating proportions
correct, though for some purposes a more sophisticated test
scoring procedure utilizing the theory of signal detection (TSD)
(Swets, Tanner, & Birdsall, 1961) may yield more useful data. The
scoring of SVT tests using TSD parameters may be particularly
attractive if SVT scores were to be used as an index of
the absolute comprehension of a passage. The scoring of SVT tests
using TSD parameters is described in Royer (in press).
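
Both scoring procedures can be sketched briefly. The text does not say
which TSD parameters are used (it defers to Royer, in press), so the
sketch below assumes the standard equal-variance sensitivity index d',
computed as z(hit rate) minus z(false alarm rate), where a hit is an
"old" response to an original or paraphrase and a false alarm is an
"old" response to a meaning change or distractor. The function names,
the response data, and the correction that keeps rates away from 0 and
1 are all illustrative assumptions, not the authors' procedure.

```python
from statistics import NormalDist

# Sentence types whose correct judgment is "old".
OLD_TYPES = {"original", "paraphrase"}


def proportion_correct(responses):
    """responses: list of (sentence_type, answer), answer 'old' or 'new'."""
    correct = sum(
        (kind in OLD_TYPES) == (answer == "old") for kind, answer in responses
    )
    return correct / len(responses)


def d_prime(responses):
    """Equal-variance d' (an assumed TSD index, see lead-in)."""
    old_answers = [a for k, a in responses if k in OLD_TYPES]
    new_answers = [a for k, a in responses if k not in OLD_TYPES]
    hits = sum(a == "old" for a in old_answers)
    false_alarms = sum(a == "old" for a in new_answers)
    # Assumed correction: add 0.5 to counts and 1 to totals so that
    # rates of exactly 0 or 1 do not produce infinite z scores.
    hit_rate = (hits + 0.5) / (len(old_answers) + 1)
    fa_rate = (false_alarms + 0.5) / (len(new_answers) + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Invented responses for a 12-item test (3 items per sentence type).
responses = [
    ("original", "old"), ("original", "old"), ("original", "old"),
    ("paraphrase", "old"), ("paraphrase", "old"), ("paraphrase", "new"),
    ("meaning_change", "new"), ("meaning_change", "new"),
    ("meaning_change", "old"),
    ("distractor", "new"), ("distractor", "new"), ("distractor", "old"),
]

print(proportion_correct(responses))  # 9 of 12 correct -> 0.75
print(d_prime(responses))
```

One reason an index of this kind can be more informative than
proportion correct: an examinee who answers "old" to every sentence
still scores 50% correct on a balanced test, but has a d' of zero.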

Research completed to date indicates that average readers get
about 75% of the SVT items correct if the tests are based on
material at grade level. With respect to listening performance,
students can typically understand difficult material better when
listening than when reading, but the difference between listening
and reading comprehension diminishes as the students get older.

The Reliability and Validity of SVT Tests

The reliability of SVT tests has been assessed in a number of
studies (Marchant, Royer and Greene, 1988; Royer, Kulhavy, Lee,
and Peterson, 1986; Royer & Hambleton, 1983; Royer, Tirre,
Sinatra, and Greene, in press) involving both children and
adults. In the studies involving children, 1150 students in
grades 3 through 8 were tested. The average reliability for SVT
reading tests in these studies was over .9 and the reliability for
listening tests was over .7. The studies involving adults
assessed U.S. Air Force enlistees and college students. The
reliabilities of the reading tests, which were shorter than those
used in the studies with children, were in the range of .6 to .7.

The research assessing the validity of the SVT as a measure
of comprehension has been conducted with Messick's (1980)
observation in mind that test validity is ultimately a matter of
construct validity and that "...construct validation is a
continuous, never-ending process developing an ever-expanding
mosaic of research evidence" (Messick, 1980, p. 1019). The
construct validation research on the SVT is briefly outlined
below.

• The SVT is sensitive to text readability (Royer, Hastings,
& Hook, 1979, Experiments 1 & 2; Royer & Hambleton, 1983; Royer,
Kulhavy, Lee, & Peterson, 1986).

• The SVT is sensitive to differences in reading
skill (Royer, et al., 1979, Experiment 2; Royer, et al., 1986;
Rasool & Royer, 1986; Royer & Hambleton, 1983).

• The SVT is sensitive to text characteristics (Royer, et
al., 1984, Experiments 2 and 3).

• SVT performance varies as a function of working memory
capacity (Lynch, 1986; 1987).

• The SVT measures passage comprehension, not just sentence
comprehension (Royer et al., 1984, Experiment 4).

• SVT tests have good divergent and convergent validity
properties (Royer, 1986).

• The SVT can measure both listening and reading
comprehension (Royer, et al., 1986; Royer, Sinatra & Schumer, in
submission; Royer, Carlisle, Walbaum & Carlo, 1987; Carlo, Sinatra
& Royer, 1988).

• SVT tests measure educational gain (Experiments 1 and 2 in
Royer, Lynch, Hambleton & Bulgarelli, 1984; Royer et al., 1987;
Carlo et al., 1988; Sinatra et al., 1988).

• SVT tests predict future learning performance (Royer,
Marchant, Sinatra, & Lovejoy, in submission; Royer, Abranovic &
Sinatra, 1987; Marchant, Royer, & Greene, 1988).

• SVT tests have diagnostic utility (Carlisle, in
submission; Royer, Sinatra, & Schumer, in submission).

Virtues of the SVT as a Measure of Language
Comprehension Performance with Latino Students

In an earlier section of this paper it was suggested that
assessment procedures for use with Latino students involved in
transitional bilingual education programs should be sensitive to
differences in cultural background, and they should be able to
assess the degree of second language competence attained by the
Latino student. In this section we will consider the ways in
which the SVT procedure meets these guidelines.

Developing tests that are sensitive to cultural background

From our perspective, local test development is the key to
developing tests that are sensitive to cultural background. Local
teachers and parents are the best judges of whether students have
the cultural background that would allow them to understand a
particular set of materials. Thus, teachers (possibly in
conjunction with parents) could select materials that they believe
to be consistent with the cultural experiences of their pupils and
tests can be based on those materials.

The SVT lends itself quite well to this perspective. SVT
tests can be based on virtually any text material. Moreover, it
is easy to train local school personnel to develop SVT tests.
This circumvents many of the problems associated with different
uses of language among speakers of a language who have varying
backgrounds.

As an instance of local SVT test development, the second
author of this paper is involved in a research project on the
island of Grenada that is assessing the impact of Computer
Assisted Instruction in a developing country. This evaluation
effort included developing reading comprehension tests for grades
2 through 7 suitable for use in Grenada. The national language in
Grenada is English, but there are striking differences between
English as spoken and used in the U.S., and English as spoken and
used in Grenada. The process of test development involved
training Grenadian teachers to develop SVT tests and then having
them develop tests based on materials selected from local sources.
The tests developed by the teachers have been administered twice,
and all of the evidence indicates the tests are valid indicators
of progress in Grenadian schools.

The Grenadian example described above, and the bilingual
assessment study to be described later in this chapter,
demonstrate how the SVT can be responsive to cultural background
issues. The tests can be based on materials that local personnel
judge to be consistent with the background experiences of the
students, and the tests can be developed by people having
linguistic and cultural experiences that are the same as the
target population to be assessed. These procedures should result
in culturally fair tests.

Developing tests that assess degree of second language competence

Earlier in this paper a distinction was made between
face-to-face competence in a second language and academic
competence in that language (Cummins, 1980; 1981; 1982). It was
suggested that accurate decisions about which of these levels of
competence had been attained were critical to the correct
placement of Latino students. Those who have attained academic
competence in English may benefit most from placement in
mainstream classrooms. However, those who have only mastered
English at the face-to-face competence level may undergo
unnecessary educational hardship through placement in mainstream
classrooms.

SVT tests provide a means of assessing the degree of language
competence attained by students in the process of learning
English. This is accomplished by administering both listening and
reading comprehension tests. In our SVT studies with Latino
students we have routinely administered listening and reading
tests in both English and Spanish. The purpose of administering
listening and reading tests in both languages is to attain a
comprehensive portrait of a student's overall language ability.
The listening and reading tests in Spanish provide indices of
native language comprehension ability (performance on the
listening tests) and reading comprehension ability. Performance
on the English listening tests provides an index of
face-to-face linguistic competence, and performance on the reading
tests provides an index of academic competence in English. In our
studies we have used audio tape recorders as a means of
administering the listening tests. But if one wanted to increase
the extent of "context-cues" present in the assessment session,
the tests could be administered by having someone read the
passages and tests.

This section has indicated how SVT tests can be responsive to
some of the needs associated with assessing Latino students in the
process of learning English. The next section will examine
evidence that the tests are in fact fulfilling these needs.

Using SVT Tests to Track the Educational
Progress of Students Enrolled in Bilingual Education Programs

This section will report the results of studies completed in
the Holyoke, Massachusetts, Public School System that evaluated
the validity of SVT tests as measures of educational progress for
students enrolled in a transitional bilingual education (TBE)
program. The procedural details of the study will be described
very briefly in this chapter. The reader interested in more
detail on procedural matters and a more comprehensive description
of the research results can find them in Carlo, Sinatra, & Royer,
1988; Royer & Carlo, 1988; and Royer, Carlisle, Walbaum, and
Carlo, 1988.

The Holyoke, Massachusetts, Public School System enrolls
approximately 6,500 students, about 20% (1400) of whom are
enrolled in TBE programs. The large majority of the TBE students
are native speakers of Spanish and virtually all of them are
Puerto Rican. The school system has two types of TBE programs.
The first is a traditional model involving six steps (levels
I, IIA, IIB, IIC, III, mainstream) and the second is a "two-way"
model in which TBE students spend considerable in-class time with
native English speaking students.

Students with little or no competence in English are placed
in Level I of the TBE program where they receive all of their
content instruction in their native language, combined with
English as a Second Language (ESL) instruction. As they acquire
competence in English, subject matter instruction in English is
phased in beginning with mathematics, then science, then social
studies, and finally reading. The phasing in process in
mathematics, science and social studies occurs during Level II
(A, B, and C) of the program, and English reading instruction is
encountered in Level III. Students are mainstreamed after the
school system judges them to be competent in English.

When Spanish speaking students enter the system they are
tested with the Bilingual Syntax Measure (BSM) and interviewed.
Evaluations of student progress are conducted twice a year. The
criteria for exit from a TBE classroom to a mainstream class are a
satisfactory score on the BSM and a grade equivalent score on the
English form of the California Achievement Test (CAT) that is at
least equivalent to the student's current grade placement. This
latter requirement has been very difficult for many of the Spanish
speaking students to achieve. Moreover, many of the native
speakers of English in the school system do not meet the
requirement of grade level performance on the CAT.

Methodology

The test development phase of the studies began with having
teachers from the school system select reading textbooks that
could serve as the basis for SVT tests in English and Spanish. The
books selected by the teachers were either in use in the system or
they were judged to be very representative of the type of reading
material both the TBE and mainstream students would be likely to
encounter in their classrooms.

The research team then selected passages from the textbooks
and edited those passages so that they were 12 sentences in
length, and they were "coherent" in that they had a beginning,
middle, and end. The coherence editing was undertaken to avoid,
as much as possible, the sense that the passages were excerpts
taken from a larger text.

The next step involved developing SVT tests based on the
passages. The tests were developed in accordance with the
procedures described in Royer, Greene, & Sinatra (1987), and Royer
(in press). The Spanish tests were developed by a graduate student
who is fluent in both English and Spanish and who is a native of
Puerto Rico. The English tests were developed by graduate
students who are native speakers of English.

The listening and reading tests were developed using passages
drawn from the same text source. When selecting passages for
inclusion in the tests two passages were selected that were judged
to be parallel in difficulty. One of these passages then became
part of the reading test and the second passage was included in
the listening test. Tests for a particular grade level were
constructed by "bracketing" a grade level. That is, the tests for
a given grade consisted of passages that were thought to be easier
than the reading competence of the average reader, passages that
were thought to be equal to the reading skill of the average
reader, and passages that were thought to be more difficult than
the reading skill of the average reader.

After the tests were developed they were returned to the
teachers for review and criticism. Following this review, final
changes in the tests were made, and the listening tests in both
Spanish and English were recorded on audio tape by the bilingual
graduate student and the reading tests were prepared in booklet
form.

The SVT tests have been administered to two student cohorts.
The first cohort consisted originally of 115 fifth grade TBE
students who were administered SVT tests in February and May,
1987, and in June 1988. The second cohort consisted originally of
120 students enrolled in the TBE program and 260 students
enrolled in mainstream classrooms. The second cohort, which
consisted of students in grades 3, 4, 5, and 6, was administered SVT
tests in December 1987, and the tests were re-administered in
June, 1988. Mainstream students were given only the English
listening and reading tests. The data reported here come primarily
from the second cohort. However, there are several issues that
will be examined using data from the first cohort.

Collection of ancillary information. The school system
completed an information questionnaire for every student
participating in the studies. This questionnaire asked for: 1)
student age, 2) date of entry into school system, 3) Lau category,
4) TBE Level at the time the SVT tests were administered, 5) grade
level of currently assigned reading book, 6) available
standardized test scores, and 7) language spoken at home.

In addition to the information provided by the school system,
the teacher of every student participating in the study rated each
of his or her students on their listening and reading
comprehension. Teachers of TBE students rated their students on
listening and reading comprehension proficiency in both Spanish
and English. Mainstream teachers rated their students only in
English.

Indices of Test Validity 

The purpose of the study was to assess whether SVT tests were
valid indicators of educational progress for students enrolled in
a bilingual education program. Evidence consistent with the
conclusion that the tests were valid indicators of educational
progress would be present if the results followed the patterns
described below.

1) Performance of the TBE students on the English tests
should vary in accordance with TBE Level. That is, Level I
students should score lower than Level II students, Level II
students should score lower than Level III students, and Level III
students should score lower than native speakers of Spanish who
are in mainstream classrooms.

2) Performance of both TBE and mainstream students should
vary on the reading tests as a function of level of assigned
reading book. For instance, if one TBE student in the 4th grade
is reading in a Spanish book at the 3.0 grade level, and a second
student is reading in a 4.0 book, the second student should
receive a higher score on the Spanish reading SVT test than the
first student.

3) Performance of both the TBE and the mainstream students
should vary in accordance with teacher ratings of competence. As
an instance, students rated as being highly competent in English
reading should score higher on the English reading SVT tests than
students rated at a moderate degree of competence, and students
with moderate ratings should score higher than students with low
ratings.

There are two other issues that can be examined in the
context of the research effort. Both of these issues provide part
of the underlying rationale for the beneficial effect of bilingual
education programs. The first issue is the assumption that
listening competence precedes reading competence. This assumption
underlies the decision to provide ESL instruction prior to
beginning systematic instruction in reading in English.

The second assumption is that academic skills acquired in one
language will transfer to a second language when that language is
acquired. Below are formal statements of these assumptions in
terms of expectations about SVT performance.

4) TBE students will have better scores on English listening
tests than they will have on English reading tests during the time
they are acquiring competence in English.

5) Performance on the Spanish reading tests will be
predictive of future skill on the English reading tests.

Results

SVT performance on the English tests as a function of TBE
level. The Holyoke school system advances children through levels
of the TBE program based on perceived increases in competence in
English. Therefore, if the SVT tests in English were valid
indicators of educational progress for the TBE students it should
be the case that SVT performance should vary in accordance with
TBE level. Figure 1 presents the results corresponding to this
expectation. The data in the figure are average proportion
correct on the SVT tests. The data are summed over the grade level
of the students for ease of presentation, but graphs drawn
separately for each of the grade levels depict essentially the
same pattern as that presented in Figure 1.

The data in Figure 1 show the performance of the TBE students
on the English SVT listening and reading tests administered in
December, 1987 (labeled ENG LIS I and ENG RDG I in the graph), and
May, 1988 (LIS and RDG II in the graph). In addition to showing
data for TBE students at each level of the TBE program (the A, B,
and C divisions in level 2 of the TBE program have been
collapsed), the graph also indicates the performance of Latino
students in mainstream classrooms on the same tests, and it
indicates the performance of native speakers of English on the
tests. The Latino students in mainstream classrooms are students
whom the teachers list as having Spanish as the language spoken at
home. Many of these students were undoubtedly graduates of the
TBE program, but it was not possible to sort out former TBE
students from Spanish speaking students who had never entered the
TBE program.

As can be seen in the figure, there is a clear relationship
between placement in the educational program and performance on
the SVT tests. Performance on both the listening and the reading
English SVT tests increases as a function of advancement through
the TBE program. It is also of interest that the mainstream
Spanish speaking students are performing at approximately the same
level as are the native English speaking students.

Another result of interest is the fact that listening
performance is consistently superior to reading performance.
This result is congruent with the previously mentioned hypothesis
that context cues associated with speech assist the understanding
of linguistic material. Moreover, it is consistent with the
educational goal of attempting to develop oral understanding of
English through ESL classes prior to beginning formal reading
instruction in English.

The only result in the graph that is somewhat puzzling is the
decline in reading performance from the first to the second
testing occasion for the students in level three of the TBE
program. This is curious given that every other group in the
study showed a gain in performance from the first to the second
testing occasion. One reason that this might have happened is
that the bulk of these students were in one fifth grade classroom
and the examiner reported that these students were particularly
unruly on the day the tests were administered.

The data presented in Figure 1 report performance on the
English tests. Similar analyses of performance on the Spanish
tests have been conducted and they show no relationship between
test performance and TBE level.

Performance on the SVT tests as a function of teacher
assessments of listening and reading competence. The teachers of
the students who participated in the study were asked to rate
every student with respect to their listening and reading
comprehension skills. These ratings were made on a 1 to 9 scale
with the ratings of 1 and 9 on the scale being anchored by the
very best student in the class (rating of 9) and the very worst
student in the class (rating of 1). After the teachers had made
their ratings the scales were collapsed to form three categories:
low (rating 1-3), medium (rating 4-6), and high (rating 7-9).
Students in the TBE program were rated on both English and Spanish
skills, whereas mainstream students were rated only on English
skills.
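
The collapsing of the 1 to 9 ratings into three categories, and the
comparison of average SVT performance across those categories, can
be sketched as follows. The function names and the example ratings
and scores are hypothetical and are given only to illustrate the
procedure; they are not data from the study.

```python
from statistics import mean


def rating_category(rating):
    """Collapse a 1-9 teacher rating into low (1-3), medium (4-6), high (7-9)."""
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating <= 3:
        return "low"
    if rating <= 6:
        return "medium"
    return "high"


def mean_score_by_category(records):
    """records: list of (teacher_rating, svt_proportion_correct) pairs."""
    grouped = {}
    for rating, score in records:
        grouped.setdefault(rating_category(rating), []).append(score)
    return {category: mean(scores) for category, scores in grouped.items()}


# Hypothetical ratings and SVT scores, for illustration only.
records = [(2, 0.55), (3, 0.60), (5, 0.70), (6, 0.72), (8, 0.85), (9, 0.90)]
print(mean_score_by_category(records))
```

If the tests track teacher judgments, the category means should
rise from low to medium to high.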

If the SVT tests were valid indices of educational progress,
performance on the listening and reading tests should vary in
accordance with teacher judgments of competence. Figure 2
presents the data from the Spanish listening and reading tests.
The data in the figure are average performance on the tests summed
over grade. Similar graphs drawn for each grade separately show
patterns much the same as the averaged data presented in Figure 2.

As can be seen in the figure, performance on both the
listening and the reading tests increases as a function of teacher
judgments of competence. Graphs similar to the one presented in
Figure 2 have been plotted for mainstream students and for TBE
students when rated on English skills and when tested with English
tests. The data for mainstream students are even more striking
than those for TBE students. That is, there is a clear
correspondence between performance on the tests and teacher
judgments of competence.

The graph depicting the relationship between the TBE
students' English SVT performance and teacher judgments was not
nearly as orderly. The likely reason for this is that the TBE
classrooms were segregated to some extent by TBE level. That is,
some classrooms had a preponderance of students at upper levels of
the TBE program whereas others contained mostly students at lower
TBE levels. This resulted in teachers making very fine competence
discriminations. In some cases teachers were judging degrees of
competence on a 1 to 9 scale between students having virtually no
English competence. In other cases teachers made discriminations
on the same scale between students judged to be near
mainstreaming. The net result was little relationship between the
teacher judgments and SVT test performance.

Another interesting aspect of the data presented in Figure 2
is that there is a greater correspondence between teacher ratings
of reading competence and SVT reading performance than there is
between ratings of listening competence and SVT listening
performance. These data parallel other data that we have collected
with mainstream students. Teachers seem to be better judges of who
are the better readers in their classrooms than they are judges of
who are the better listeners. This probably reflects the
generally greater emphasis placed on reading as an educational
goal.

SVT performance and reading book level. One of the items of
information collected in the study was the level of the reading
book in which the students were currently working. If SVT reading
tests were accurately measuring reading skill it should be the
case that test performance varied in accordance with the level of
the assigned reading book. The data on reading book level do
not lend themselves to summarization over grades because the
number of reading book levels represented differs from grade to
grade. Given this, the data for only grade 6 will be presented.
The graphs for the other grades show the same relationship between
performance on the tests and reading book level. The data in
Figure 3 show the performance of grade 6 TBE students on the
Spanish SVT tests. The reading book levels are in grade units
(i.e., 4.0 is a beginning 4th grade level textbook) and are the
levels of Spanish reading books in which the students are working.
Very few of the TBE students were receiving reading instruction in
English books.

The figure shows a clear relationship between performance on
the SVT reading tests and the level of the reading book in which
the students are currently working. It is also interesting to
note a relationship between the level of the reading book the
students are working in and their performance on the Spanish
listening tests. This result is again similar to that found in
other studies (e.g., Royer, Sinatra, and Schumer, in submission)
where listening ability of good readers typically exceeds the
listening ability of poor readers.

Graphs similar to the one presented in Figure 3 have been
examined for mainstream students. They look very similar to
Figure 3 in that there was a clear relationship between the level
of the English reading book the mainstream students were working
in and their performance on the English listening and reading
tests.

Does linguistic competence transfer? One of the most
fundamental assumptions underlying transitional bilingual
education programs is that educational skills acquired in a native
language will transfer to a developing second language
(e.g., Cummins 1983, 1984; Hakuta, 1986). Despite the importance
of this issue, very little empirical research has been done in
this area as noted by Hakuta in the following statement:

"What is remarkable about the issue of transfer of skills is
that despite its fundamental importance, almost no empirical
studies have been conducted to understand the
characteristics or even to demonstrate the existence of
transfer of skills" (1986, p. 218).

The second author can verify Hakuta's observation. A very
thorough search of the literature has not turned up a single
empirical demonstration of educational skills acquired in a native
language transferring to skill in a second language.

The data acquired from the students tested in February and
May 1987, and in June, 1988 can be used to evaluate the transfer
issue. One-hundred fifteen fifth grade TBE students were tested
in February, 1987. Unfortunately, only 49 of the original 115
students were available for testing in June, 1988, and many of
these students did not have a complete data record. Table 1
presents the results of pairwise correlational analyses which were
computed using all available data. This means that the N
contributing to each correlation will vary. As can be seen at the
bottom of the table, the smallest N contributing to a correlation
was 29.

The most interesting data in the table involve the
relationships between the other test scores and performance on the
English listening and reading tests administered in June, 1988
(Eng Lis 3 and Eng Rdg 3 in the table). As can be seen in the
table, the only significant predictor of listening 3 SVT
performance was listening 2 SVT performance, and the next
best predictor was listening 1 SVT performance. This means that
the best predictor of the TBE students' English listening
competence in June, 1988 was their English listening competence 12
months before, and the next best predictor was their English
listening competence 17 months before.

The predictions of listening performance can be contrasted to
the predictions of reading performance. As the table shows, the
best predictor of English reading performance was Spanish reading
performance the year before. Spanish reading performance was a
better predictor than previous English reading performance.
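
The pairwise analysis behind Table 1, in which each correlation is
computed from every student who has both of the scores involved,
can be sketched as follows. The function names, the "Spa Rdg 2" and
"Eng Lis 2" labels, and the score values are hypothetical; the
sketch shows only why the N contributing to each correlation
varies, and it does not reproduce any value from the table.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None if either variable has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(scores):
    """scores: dict of test name -> list of scores, one per student, None if missing.

    Returns {(test_a, test_b): (r, n)} using all students with both scores.
    """
    results = {}
    for a, b in combinations(scores, 2):
        pairs = [(x, y) for x, y in zip(scores[a], scores[b])
                 if x is not None and y is not None]
        if len(pairs) < 3:
            results[(a, b)] = (None, len(pairs))
            continue
        xs, ys = zip(*pairs)
        results[(a, b)] = (pearson(xs, ys), len(pairs))
    return results


# Hypothetical scores for five students; None marks a missing test.
scores = {
    "Spa Rdg 2": [0.6, 0.7, 0.8, None, 0.9],
    "Eng Rdg 3": [0.5, 0.6, 0.7, 0.65, None],
    "Eng Lis 2": [0.7, 0.72, 0.8, 0.75, 0.9],
}
for pair, (r, n) in pairwise_correlations(scores).items():
    print(pair, r, n)
```

Because a student is dropped only from the correlations that need a
missing score, this approach keeps more data than discarding every
student with an incomplete record.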

These data must be interpreted cautiously given the small
number of students involved and given the magnitude of the
relationships as illustrated by the relatively low correlation
coefficients. But the data certainly suggest that the English
listening competence that students acquire in and out of school
translates into increasing ability to understand English, and they
suggest that reading skill acquired in Spanish transfers to
reading skills in English as the students improve their
competence in English. The data collected in December, 1987 and
June 1988 from the much larger student cohort will be valuable in
assessing the transfer issue in the future.

Summary of the Empirical Studies

In the early part of this section three checks on the
validity of SVT tests as indices of progress in bilingual
education programs were suggested. The first check was that
performance on the English versions of the tests should vary in
accordance with placement of students in the TBE program. The
second check was that performance on the tests should vary in
accordance with teacher judgments of competence. And the third
check was that performance on the reading tests should vary in
accordance with the grade level of the assigned reading book.

Performance on the SVT tests was congruent with all three of
these expectations. The TBE students improved in performance on
the English listening and reading tests as they advanced through
levels of the TBE program, performance on the tests varied in
accordance with teacher expectations of listening and reading
competence, and performance on reading tests was better for
students assigned upper level reading books.

In addition to providing evidence regarding the validity of
SVT tests in a bilingual context, the studies that have been
conducted thus far provide some support for assumptions underlying
TBE programs. Specifically, the results indicated that early
competence in listening and understanding English was related to
English listening competence a year later, and that reading
competence in Spanish was predictive of subsequent reading
competence in English. These results are consistent with the
practices of providing ESL instruction as part of a TBE program,
and with providing reading instruction in the native language
during the period of acquiring competence in the second language.

Future Directions in SVT Assessment

Work that is currently underway will expand the scope of SVT
assessment in three ways. First, we are examining whether SVT
assessment is useful in content areas. In a pilot study using 7th
grade TBE and mainstream students, SVT tests have been developed
based on materials drawn from science and social studies text
sources. If this study demonstrates that SVT tests can be used to
assess the degree to which students can read and understand
content area materials we will expand the scope of this work to
encompass more grade levels and content areas (e.g., mathematics).
This research will build upon previous research which has shown
that SVT test performance on content materials can be used to
predict learning performance in that content area (Royer,
Abranovic & Sinatra, 1987; Royer, Marchant, Sinatra & Lovejoy, in
submission).

A second future concern is to develop better ways to train 
local school personnel to develop their own SVT tests. One 
approach to SVT training that we are currently working on involves 
developing computer-based training programs. Two programs are 
under development. The first is named the SVT Test Maker (Walczyk 
& Royer, in submission). The SVT Test Maker is a program that 
automates many of the details associated with SVT test 
development. The test developer types in the text that is to 
serve as the basis for the SVT test, and the program parses the 
text and then asks the developer several questions about how the 
test should be arranged. After answering these questions, the 
program presents the developer with each sentence in the original 
passage and indicates that a test sentence of a particular type 
should be developed based on the presented sentence. The 
developer then types in the test sentence. After each test 
sentence has been developed, the program automatically formats the 
test and prints it out, complete with instructions, in a form 
that is ready to be reproduced and administered. The SVT Test 
Maker currently exists only in an English version, but after 
further refinements our intent is to develop it in a Spanish 
version. 
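
The workflow just described can be illustrated with a minimal 
sketch in Python. This is not the SVT Test Maker itself: the 
function names, the naive sentence splitter, the simple rotation 
through the four sentence types, and the YES / NO response format 
are all assumptions made for illustration. 

```python
import re
from itertools import cycle

# The four SVT test-sentence types named in this paper.
SENTENCE_TYPES = ("original", "paraphrase", "meaning change", "distractor")


def split_sentences(passage):
    """Naive splitter (an assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", passage.strip())
    return [part for part in parts if part]


def plan_test(passage, types=SENTENCE_TYPES):
    """Pair each passage sentence with the type of test sentence to write.

    Types are assigned by simple rotation, which is a hypothetical choice;
    the real program asks the developer how the test should be arranged.
    """
    return list(zip(split_sentences(passage), cycle(types)))


def format_test(test_sentences, instructions):
    """Lay out the developer's finished test sentences as a printable test."""
    lines = [instructions, ""]
    for number, sentence in enumerate(test_sentences, start=1):
        # The YES / NO response format is assumed for illustration.
        lines.append(f"{number}. {sentence}   YES / NO")
    return "\n".join(lines)


if __name__ == "__main__":
    passage = "The sun rose. Birds sang! Did you hear them? We did."
    for sentence, kind in plan_test(passage):
        print(f"Write a '{kind}' test sentence for: {sentence}")
```

In this sketch the developer would write one test sentence for 
each prompt and then pass the finished sentences to format_test 
to obtain a page that is ready to be reproduced. 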

The second computer program will be called the SVT Trainer. 
This program will be a tutorial that will train local school 
personnel to develop SVT tests. It will include an introduction 
to SVT testing, a brief description of prior research, and 
considerable practice with appropriate feedback on the development 
of SVT test items. The trainee will also receive instruction on 
constructing SVT tests, administering SVT tests, scoring SVT 
tests, and interpreting test results. 

The final SVT project that is envisioned for the future is a 
computer-based reading diagnostic system that will include SVT 
listening and reading tests. This diagnostic system is designed 
to assist in diagnosing reading difficulties in students who do 
not seem to benefit from normal reading instruction. As with the 
other systems, the diagnostic system could be developed in Spanish 
as well as English. 

Concluding Remarks 
We view Latino testing as being divided into two separable 
concerns. The first concern is that tests used to assess Latino 
students be sensitive to cultural and linguistic influences during 
the time the student is acquiring second language proficiency and 
being acculturated. The second concern is that tests be developed 
that are not systematically biased against students who have 
developed second language proficiency and a degree of acculturation. 

The first of the above concerns is critical to the effective 
education of Latino students and we believe that it can only be 
addressed at the local level. The introduction to this chapter 
reviewed evidence that culture and language can influence 
cognitive performance in general and test performance in 
particular. In our view there is no way that a single test can be 
truly fair to Latino students having differing cultural 
experiences and linguistic traditions. Our solution to this 
dilemma is to argue for tests that are tailor-made to fit the 
cultural and linguistic experiences of the target population. 

Until recently, the argument for locally based assessment would 
have been a hopeless expectation. Teacher-made tests are 
notorious for their poor reliability and validity, and local 
school systems do not have the resources to meet the enormous 
costs of developing reliable and valid tests using traditional 
psychometric procedures. We have presented evidence that SVT 
tests provide one means of meeting the need for locally based 
assessment at a cost that local school systems can easily afford. 
We are also confident that once the idea that quality local 
assessment is possible is accepted, other procedures will be 
developed to meet this need. 

We also believe that progress is being made on the 
development of standardized tests that are not systematically 
biased against minority students. For example, one area in which 
test makers are making considerable progress is in test bias 
detection. Educational Testing Service has recently implemented a 
major effort for monitoring test bias through a technique called 
"differential item functioning." This technique compares the 
performance (at the test item level) of "target groups" (women, 
Blacks, Latinos, Native Americans and Asian Americans) against 
that of the majority group. If either a target group or the 
majority group displays differentially lower performance on an 
item, that item undergoes close scrutiny; following the scrutiny, 
the item may be deemed inappropriate and not counted in the test. 
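
As a rough illustration of an item-level comparison of this kind, 
the sketch below computes a Mantel-Haenszel common odds ratio for 
a single item. The choice of this statistic is an assumption: the 
description above does not say which statistic is used, and the 
function names and the flagging threshold of 1.5 are hypothetical. 
Test takers are first grouped into strata of comparable total 
score, so that the groups are compared at similar ability levels. 

```python
def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio for one item across matched score strata.

    Each stratum is a tuple
        (majority_correct, majority_wrong, target_correct, target_wrong)
    counted among test takers with comparable total scores.
    A value near 1.0 suggests the item behaves alike for both groups;
    above 1.0 the item favors the majority group, below 1.0 the target group.
    """
    numerator = 0.0
    denominator = 0.0
    for maj_right, maj_wrong, tgt_right, tgt_wrong in strata:
        total = maj_right + maj_wrong + tgt_right + tgt_wrong
        if total == 0:
            continue  # an empty stratum carries no information
        numerator += maj_right * tgt_wrong / total
        denominator += maj_wrong * tgt_right / total
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def needs_scrutiny(odds_ratio, threshold=1.5):
    """Flag an item when either group shows differentially lower performance.

    The threshold is a hypothetical value chosen for illustration only.
    """
    return odds_ratio > threshold or odds_ratio < 1 / threshold


if __name__ == "__main__":
    item_counts = [(40, 10, 30, 20), (30, 20, 20, 30)]  # made-up counts
    ratio = mantel_haenszel_odds_ratio(item_counts)
    print(round(ratio, 3), needs_scrutiny(ratio))
```

A flagged item is only a candidate for review. As noted above, the 
decision to drop an item follows close scrutiny of its content. 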

In addition, recent positive steps have been taken by test 
makers to help Latino students in the areas of test familiarity 
and test-taking strategies. Standardized tests, such as the SAT, 
have very clear formats. Familiarity with the format of the 
various sections will reduce both the anxiety that a student will 
experience in taking the test and the time that student must spend 
figuring out what a particular test section is asking. For 
example, the SAT-math has a section called "quantitative 
comparisons" where the test taker must compare two quantities and 
decide if: a) the quantity in the first column is less than the 
quantity in the second column, b) the quantity in the first column 
is greater than the quantity in the second column, c) the two 
quantities are equal, or d) there is not enough information to 
determine the relative sizes of the two quantities. A student 
walking into the SAT without being extremely familiar with the 
four answer categories for these items will be at a clear 
disadvantage since that student will have to spend considerable 
time looking back at the four options before responding to the 
items. 

Further, there are sound test-taking strategies that can help 
any student's performance in standardized multiple choice tests. 
For example, students who are familiar with the optimal "guessing 
strategies" in those types of tests that correct for possible 
guessing by subtracting a percentage of the incorrect answers from 
the correct answers will likely show a higher performance than 
students who are unaware of these strategies. One such strategy 
consists of instructing students in the advantage of guessing in 
those cases where the possible choices have been narrowed down to 
two or three from a field of five, while a complementary strategy 
consists of explaining to students why wild guessing will not help 
in the slightest. Finally, a wise strategy in timed multiple 
choice tests is not to spend too much time on any single item, a 
strategy which if ignored can increase the speededness of a test. 
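
The arithmetic behind these guessing strategies can be sketched 
as follows. The sketch assumes the conventional correction in 
which each wrong answer on an item with k choices costs 1/(k - 1) 
of a point; the text above says only that "a percentage" of the 
incorrect answers is subtracted, so the exact penalty and the 
function names are assumptions. 

```python
from fractions import Fraction


def formula_score(right, wrong, choices=5):
    """Corrected score: right answers minus wrong / (choices - 1).

    The 1 / (choices - 1) penalty is an assumed, conventional correction.
    Omitted items neither add to nor subtract from the score.
    """
    return Fraction(right) - Fraction(wrong, choices - 1)


def expected_gain_from_guess(remaining, choices=5):
    """Expected change in score from guessing at random among the
    `remaining` options left after eliminating the others."""
    p_right = Fraction(1, remaining)
    penalty = Fraction(1, choices - 1)
    return p_right - (1 - p_right) * penalty


if __name__ == "__main__":
    for remaining in (5, 3, 2):
        print(remaining, expected_gain_from_guess(remaining))
```

Under this assumption a wild guess among all five choices has an 
expected gain of zero, while guessing after narrowing the field 
to three or two choices has an expected gain of 1/6 or 3/8 of a 
point per item, which is the reasoning behind the two strategies. 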

As in the case of monitoring possible item bias, Educational 
Testing Service and the College Board are taking an initiative in 
helping Latino students in both of these areas. A recent 
Educational Testing Service-College Board publication entitled 
"Preparing for the PSAT/NMSQT for Latino High School Students" 
addresses these two specific issues for the Preliminary Scholastic 
Aptitude Test. 

Despite these encouraging trends, there is a need for more 
research focusing on testing issues with minority populations, as 
well as more cooperation and communication between researchers and 
test makers. For example, two areas where more research should be 
conducted with minority populations are test speededness and test 
validity. Because language minority students often read more 
slowly than majority students, they are likely to reach fewer 
items in a timed test and thereby exhibit a poor performance 
(Mestre, 1986b); it is therefore important to have a thorough 
understanding of the possible adverse effect of speededness on 
test performance for language minority students. 

Test validity, which refers to whether or not a particular 
test is in fact an accurate measure of what it purports to 
measure, can be an unwieldy research area. It is conceivable that 
a test could be a valid instrument for a language minority group 
yet be insensitive in the cultural/linguistic dimension, or 
conversely that a test could be designed to be culturally and 
linguistically sensitive for a language minority group yet not be 
valid for that group. What is clear from existing research on 
validity is that tests which appear to be good predictors of 
future cognitive performance for majority students are not as good 
for predicting performance for minority students (Dalton, 1974; 
Hedges & Majer, 1976; Houston, 1980; McCornack, 1983; Mestre, 
1981). These studies suggest that caution must be exercised in 
how we interpret test scores among minority populations. 

The impression with which we hope to leave the reader is that 
despite the enormous problems that remain in the area of testing 
language minority populations, progress is being made. We believe 
testing procedures now exist that will allow local school 
personnel to develop tests that are sensitive to the cultural and 
linguistic backgrounds of their students. We also believe that 
standardized test developers are very concerned about issues of 
test bias and test validity as they relate to the assessment of 
minority populations, and that steps are being taken to reduce 
the possibility that standardized tests are biased and invalid 
measures of the abilities of minorities. 